CROSSBRED OR HYBRID
Progeny that result from the cross
of two parental lines or breeds.

QUANTITATIVE TRAIT LOCI
(QTL). Genetic loci or
chromosomal regions that
contribute to variability in
complex quantitative traits (such
as plant height or body weight),
as identified by statistical analysis.
Quantitative traits are typically
affected by several genes, and the
environment.
THE USE OF MOLECULAR GENETICS
IN THE IMPROVEMENT OF
AGRICULTURAL POPULATIONS

Jack C. M. Dekkers* and Frederic Hospital‡

Substantial advances have been made in the genetic improvement of agriculturally important
animal and plant populations through artificial selection on quantitative traits. Most of this
selection has been on the basis of observable phenotype, without knowledge of the genetic 
architecture of the selected characteristics. However, continuing molecular genetic analysis of 
traits in animal and plant populations is leading to a better understanding of quantitative trait 
genetics. The genes and genetic markers that are being discovered can be used to enhance the 
genetic improvement of breeding stock through marker-assisted selection. 



MULTIFACTORIAL GENETICS
Genetic improvement through artificial selection has 
been an important contributor to the enormous 
advances in productivity that have been achieved over 
the past 50 years in plant and animal species that are of 
agricultural importance (FIG. 1). Most of the traits that
are selected on are complex quantitative traits, which 
means that they are controlled by several genes, along 
with environmental factors, and that the underlying 
genes have quantitative effects on phenotype. So far, 
most selection has been on the basis of observable phe- 
notype, which represents the collective effect of all genes 
and the environment.

Sophisticated testing and selection strategies have 
been developed and implemented for many species, with 
the aim of improving the genetic performance in a breed 
or line through recurrent selection or introgression 
(BOX 1). Another goal is to develop superior crossbreds or
hybrids through the combination of several improved 
lines or breeds. Until recently, these selection pro- 
grammes were conducted without any knowledge of the 
genetic architecture of the selected trait. Andersson 1 and
Mauricio 2 recently reviewed how molecular genetics is
used to discern the genetic nature of quantitative traits in
animal and plant species, respectively, by identifying
genes or chromosomal regions that affect the trait, so-called
QUANTITATIVE TRAIT LOCI (QTL). The purpose of this



article is to show how this information can be used to 
enhance genetic improvement of agriculturally impor- 
tant species. Our emphasis is on the use of natural varia- 
tion in a species, rather than on the introduction of new 
genetic variation through genetic modification, although 
some of the programmes reviewed, such as introgres- 
sion, are also important in the introduction of trans- 
genes into breeding populations. 

The quantitative genetic approach 

The quantitative genetic approach to selection is based 
on knowledge of population genetic parameters for the 
traits of interest, such as HERITABILITIES, genetic variances
and genetic correlations 5 . These parameters can be esti- 
mated using statistical analysis of phenotypic data from 
pedigrees 4 . However, the genetic architecture of the trait 
itself is treated as a black box, with no knowledge of the 
number of genes that affect the trait, let alone of the
effects of each gene or their locations in the genome. 
More specifically, quantitative genetic theory is based on
Fisher's infinitesimal genetic model 3 , in which the trait is
assumed to be determined by an infinite number of
genes, each with an infinitesimally small effect. On the 
basis of this model, the expected increase in mean per- 
formance of a population per generation through 
genetic selection is proportional to the accuracy with 
Figure 1 | Examples of genetic improvements in livestock and crops. a | Average milk
production per lactation of US Holstein cows has nearly doubled during the past 40 years, as
shown by the top line (phenotypic yield) (Animal Improvement Programs Laboratory;
http://aipl.arsusda.gov/…). More than half of this has been due to improved
genetics, as shown by the bottom line, which plots the progression of the population average
genetic value for milk yield. b | For corn, yields have increased fourfold during the past 60 years
(http://www.usda.gov/nass/…). Although yields have fluctuated from year to
year, primarily due to weather, there has been a consistently increasing trend, as shown by the
regression line (dashed). Again, more than half of this increased yield has been a result of genetic
improvement 62 .



HERITABILITY
The fraction of the phenotypic
variance that is due to additive
genetic variance.

GENETIC VARIANCE
Variation in a trait in a
population that is caused by
genetic differences.

GENETIC CORRELATION
The correlation between traits
that is caused by genetic as
opposed to environmental
factors. A genetic correlation
between two traits results if the
same gene affects both traits
(pleiotropy) or if genes that
affect the two traits are in linkage
disequilibrium.



which the breeding value of selection candidates can be
estimated, the intensity of selection and the genetic variation
in the population (see also the review by Barton
and Keightley on p. 11 of this issue).
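
The proportionality described above is often written as the product of selection intensity, accuracy and the additive genetic standard deviation. The following is a minimal sketch that assumes this standard form; the function name and the example numbers are illustrative and are not taken from the article.

```python
import math


def expected_response(intensity, accuracy, additive_variance):
    """Expected change in the population mean per generation.

    Assumes the standard form: response = i * r * sigma_A, where
    i is the selection intensity (in standard deviation units),
    r is the accuracy of the estimated breeding value (0 to 1), and
    sigma_A is the square root of the additive genetic variance.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie between 0 and 1")
    if additive_variance < 0.0:
        raise ValueError("additive variance cannot be negative")
    return intensity * accuracy * math.sqrt(additive_variance)


# Hypothetical inputs: intensity 1.4, accuracy 0.5,
# additive genetic variance 100 (in squared trait units).
gain = expected_response(1.4, 0.5, 100.0)
print(f"expected response per generation: {gain:.2f} trait units")
```

Under these assumptions, any tool that raises the accuracy term, which is the role envisaged for molecular data later in the text, raises the expected response proportionally.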

Despite the obvious flaws of the infinitesimal model,
the tremendous rates of genetic improvement that have
been achieved (FIG. 1) attest to the usefulness of the
quantitative genetic approach. Nevertheless, quantitative
genetic selection has several limitations: the
phenotype is an imperfect predictor of the breeding
value of an individual; it might not be observable in both
genders or before the time when selection decisions
must be made; and selection on phenotype is not very
effective in resolving negative associations between genes,
such as those caused by linkage or epistasis. The ideal
situation for quantitative genetic selection is that the
trait has high heritability and



that the phenotype can be observed in all individuals 
before reproductive age. This ideal is hardly ever 
achieved (TABLE 1), which limits the effectiveness of
quantitative genetic selection. However, because DNA 
can be obtained at any age and from both genders, mol- 
ecular genetics can alleviate some of these limitations, as 
will be discussed below. 

Whereas selection in breeding populations primarily 
focuses on additive genetic effects, the non-additive 
effects of heterosis or hybrid vigour, which are observed 
when lines or breeds are crossed, have also contributed 
greatly to the performance of livestock and crops. In the 
absence of any molecular data, breeding programmes 
that are aimed at producing new and improved hybrids 
or crossbreds are largely based on extensive testing by 
trial and error. In plants, lines have been placed in a lim- 
ited number of heterotic groups, which, when crossed, 
typically result in substantial hybrid vigour. 

How can molecular genetics help? 

Molecular genetic analyses of quantitative traits lead to 
the identification of two broadly different types of 
genetic loci that can be used to enhance genetic 
improvement programmes: causal mutations and presumed
non-functional genetic markers that are linked
to QTL (indirect markers). Causal mutations for quantitative
traits are hard to find and difficult to prove, and few
examples are available 1 . By contrast, non-functional or
anonymous polymorphisms are abundant across the 
genome and their linkage with QTL can be established 
by evidence of empirical associations of marker geno- 
types with trait phenotype. Two approaches are used to 
identify indirect markers 1 : directed searches using candidate-gene
approaches in unstructured populations 6 ;
and genome-wide searches in specialized populations, 
such as F 2 crosses. Because candidate-gene markers 
focus on polymorphisms in a gene that are postulated 
to affect the trait, they are often tightly linked to the 
QTL. A candidate-gene marker can occasionally repre- 
sent the functional variant itself, although this is difficult 
to prove 1 . Genome scans, conversely, can only identify 
regions of chromosomes that affect the trait. The length 
of these regions is typically 10-20 cM, but the exact 
position and number of QTL in the region is unknown. 

Whereas causative polymorphisms give direct infor- 
mation about genotype for the QTL, the use of indirect 
markers for QTL mapping and for selection is based on 
the existence of linkage disequilibrium (LD) between the 
marker and the QTL. Marker-QTL LD can exist at the 
population level but always exists within families, even 
between loosely linked loci (BOX 2). Although two loci are
expected to be in population-wide equilibrium in large 
random-mating populations, partial population-wide 
LD can exist by chance between tightly linked loci in 
breeding populations that are under selection. 
Population-wide LD can also be created by crossing lines
or breeds. Although LD will then exist even between
loosely linked loci, this LD will erode rapidly over generations.
Indirect markers that are identified using the candidate-gene
approach are expected to be in substantial
LD with the QTL with which they are associated. Unless 
Box 1 | Genetic improvement of agricultural species



Two important strategies for genetic improvement are recurrent selection and 
introgression programmes. The aim of a recurrent selection programme, which is the 
main vehicle for genetic improvement in livestock, is to improve a breed or line as a
source of superior germplasm for commercial production through within-breed or
within-line selection {part a of figure). This involves recording the phenotypes of 
numerous individuals and the use of these phenotypes to estimate the 'breeding 
value' of selection candidates. An example of performance testing is the progeny test, 
in which the breeding values are estimated on the basis of the phenotype of progeny 
that have been created through test matings. Test matings can be to individuals of the 
same breed or line if the aim is to improve pure-bred performance, or to individuals 
from another breed or line if the objective is to improve crossbred or hybrid 
performance. Improvement of stock for commercial production often involves 
further product development through testing, breeding or crossing to generate 
crossbreds or hybrids. 

Introgression is another important genetic improvement strategy, in particular in 
plants (part b of figure). The aim of an introgression programme is to introduce a 
'target' gene, which can be a single gene, a quantitative trait locus or a transgenic 
construct, from an otherwise low-productivity line or breed (donor) into a 
productive line that lacks that particular gene (recipient; R). Introgression starts by 
crossing the donor and recipient lines, followed by repeated backcrosses (BC) to the 
recipient line to recover the recipient-line genome. The target gene is maintained in 
the backcross generations through selection of donor gene carriers. Recovery of the 
recipient genome can be enhanced by the selection of backcross individuals that have 
a high value for the recipient trait phenotype. Note that genetic improvement for this 
trait can be maintained by continuing recurrent selection in the recipient line 
(vertical arrows). Once a sufficient proportion of the recipient genome is recovered, 
the backcross line is intercrossed (to generate IC lines), and donor gene homozygotes 
are selected to fix the target gene. This might require more than one generation to
obtain sufficient individuals for further breeding or if several target genes must be 
introgressed. The effectiveness of introgression schemes is limited by the ability to 
identify backcross or intercross individuals with the target gene and by the ability to 
identify backcross individuals that have a high proportion of the recipient genome, 
in particular in regions around the target gene 60 . 
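
The recovery of the recipient genome during backcrossing can be made concrete with a small calculation. This sketch assumes no selection on background markers and considers only genome regions unlinked to the target gene, in which case the expected donor fraction halves with every backcross; the function names are hypothetical.

```python
def expected_recipient_fraction(n_backcrosses):
    """Expected recipient genome fraction after n backcrosses.

    Generation 0 is the F1 (half donor, half recipient). Each backcross
    to the recipient halves the expected donor fraction, so the donor
    share after n backcrosses is 0.5 ** (n + 1). This ignores linkage
    drag around the target gene and any marker-based selection.
    """
    if n_backcrosses < 0:
        raise ValueError("number of backcrosses cannot be negative")
    return 1.0 - 0.5 ** (n_backcrosses + 1)


def backcrosses_needed(target_fraction):
    """Smallest number of backcrosses whose expectation reaches the target."""
    if not 0.0 <= target_fraction < 1.0:
        raise ValueError("target fraction must be in [0, 1)")
    n = 0
    while expected_recipient_fraction(n) < target_fraction:
        n += 1
    return n


for generation in range(0, 7):
    share = expected_recipient_fraction(generation)
    print(f"BC{generation}: expected recipient genome = {share:.4f}")
```

Background selection with markers aims to beat this expectation, so that an acceptable recipient fraction is reached in fewer generations than the unselected halving series would require.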
the functional polymorphism has been identified, however,
the linkage phase of a candidate-gene marker with the
functional variant can differ from one population to the 
next and must, therefore, be assessed in the population 
in which it will be used. Although more abundant and 
extensive, within-family LD is more difficult to use 
because linkage phases between the markers and QTL 
will not be the same in all families and must, therefore,
be assessed on a within-family basis.

The use of molecular genetics in selection pro- 
grammes rests on the ability to determine the genotype 
of individuals for causal mutations or indirect markers
using DNA analysis. This information is then used to 
assess the genetic value of the individual, which can be 
captured in a molecular score that can be used for selection.
This removes some of the limitations of quantitative
genetic selection discussed above (TABLE 1).

It is clear that the use of molecular data for genetic 
improvement would be most effective if the genetic 
architecture of a quantitative trait was completely 
transparent, such that we knew the number, the posi- 
tions and the effects of all the genes involved. In that 
case, the process of selection would be reduced to a 
simple 'building block' problem (genotype building)
of selection and mating to create individuals with the 
right combination of alleles at each QTL. However, 
this situation is far from reality and might never be 
achieved; although advances in molecular genetics 
have been able to partially explain the 'black box' of 
quantitative traits, the information provided by mole- 
cular data is incomplete, for three main reasons. First, 
in most cases, only a limited number of genes that 
affect the trait has been identified, albeit the ones with 
larger effects. A substantial part of the black box there- 
fore remains obscure, and selection exclusively on 
genotype for identified QTL would not result in a
maximum response to selection. Instead, selection on 
molecular score must be combined with selection on 
phenotype, which reflects the collective action of all 
genes, including those that have not been identified. 
Second, with indirect markers, selection is not directly 
on the QTL, but on the marker, through LD. As LD 
erodes in the course of the selection programme 
owing to recombination, the efficiency of selection is 
reduced. Third, for both causal and indirect markers, 
the effects of the QTL must be estimated empirically 
on the basis of statistical associations between markers 
and phenotype. So, the use of molecular information 
does not remove the need for phenotypic information 
and, therefore, suffers to some degree from the same 
limitations as quantitative genetic selection. 

Application of molecular data 

Despite the limitations outlined above, molecular 
genetic information can be used to enhance several 
breeding strategies through what is broadly referred 
to as marker-assisted selection (MAS). All strategies 
for MAS are based on the use of a molecular score, 
although the composition of this score differs from 
application to application (TABLE 2). In addition to 
those described below, the applications of molecular 



24 | JANUARY 2002 | VOLUME 3 | www.nature.com/reviews/genetics
© 2001 Macmillan Magazines Ltd



Table 1 | Limitations of quantitative genetic selection and opportunities for the use of molecular data

Limit to quantitative selection | Example trait(s) | Help provided by molecular data | Possible breeding solution | Economic merit of molecular data
Phenotype is a poor predictor of breeding value (low heritability). | Reproduction in animals, yield in plants. | Better estimate of breeding value at identified QTL. | Select on molecular score and phenotype. | Depends on requirements for QTL detection. Difficult to prove.
Phenotype is difficult or expensive to record. | Disease-related traits. | Markers are easier or cheaper to score than phenotype. | Select on molecular score. | Proportional to cost of phenotyping versus genotyping. Easy to prove.
Phenotype expressed subsequent to reproductive age. Long generation interval. | Reproduction traits in animals, grain yield in plants. Tree breeding. | Molecular score is available at earlier stage, resulting in faster selection. | Select on molecular score in combination with phenotype of ancestors. | Allows more rapid genetic gain and earlier release of improved genetic material.
Individual has to be sacrificed to score its phenotype. | Meat quality in animals, malting quality in barley. | Molecular score is available on all selection candidates. | Select on molecular score in combination with phenotype of relatives. | Substantial increase in genetic gain expected.
Traits observed only in one gender. | Milk yield in dairy cattle. | Molecular score is available at an early age on both genders. | Select on molecular score and phenotype. Pre-select on molecular score for further phenotypic testing (for example, progeny test). | Moderate, depending on opportunity for and costs of pre-selection.
Genetic potential is masked by epistatic interactions between QTL, or by linked QTL that are in repulsion. | Many traits. | Dissect and break down unfavourable interactions at the genetic level. | Select on molecular score and phenotype. | Difficult but can be spectacular if successful.
Genotype-environment interactions. | Many traits. | Predict interactions at the genetic level. | Select on molecular score and phenotype. | Unknown. Difficult to prove.

QTL, quantitative trait locus.



BREEDING VALUE
A measure of the value of an
individual for breeding purposes,
as assessed by the mean
performance of its progeny.

HETEROSIS OR HYBRID VIGOUR
When a hybrid or crossbred
individual has a higher
performance than the average of
its two parents (the animal
breeding definition), or than the
best parent (the plant breeding
definition). This is the result of
non-additive actions of genes
((over-)dominance and/or
epistasis).

LINKAGE DISEQUILIBRIUM
(LD). The condition in which the
frequency of a particular
haplotype for two loci is
significantly different from that
expected under random mating.
The expected frequency is the
product of observed allelic
frequencies at each locus.

LINKAGE PHASE
The arrangement of alleles at two
loci on homologous
chromosomes. For example, in a
diploid individual with genotype
Mm at a marker locus and
genotype Qq at a quantitative
trait locus, possible linkage
phases are MQ/mq and Mq/mQ,
for which '/' separates the two
homologous chromosomes.



data in genetic programmes include their use for
parentage verification or identification (for example,
when mixed semen is used in artificial insemination),
and in genetic conservation programmes to identify
unique genetic resources and quantify genetic
diversity.

Genotype building programmes. If many QTL are 
known, and favourable alleles are present in different 
lines or breeds, genotype building strategies can be used
to design new genotypes that combine favourable alleles
at all loci. Selection is then based on the molecular score
alone, which is determined by the genotype at those loci
(possibly estimated through indirect markers), along
with (if possible) information on linkage and linkage
phase between those loci. Starting from a cross between
two parental lines, the simplest genotype building strategy
involves screening a population for individuals that
are homozygous at the relevant loci 7 . More than one
generation of mating and selection might be needed to
produce individuals that are homozygous for a larger
number of loci 8,9 . In certain crop species, DOUBLE-HAPLOID
(DH) lines are used, which provide homozygous recombinant
genotypes in a single step, but these are not available
in animals.
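
To see why screening becomes impractical as the number of loci grows, consider the expected frequency of the target genotype. The sketch below assumes unlinked loci, two inbred parents fixed for alternative alleles, and fully informative genotyping; under those assumptions an F2 individual is homozygous for the favourable allele at a locus with probability 1/4, and a doubled-haploid line with probability 1/2. The function names and the 95% threshold are illustrative.

```python
import math


def target_frequency(n_loci, per_locus_probability):
    """Probability that one individual carries the target genotype at all loci."""
    if n_loci < 1:
        raise ValueError("at least one locus is required")
    return per_locus_probability ** n_loci


def min_population_size(frequency, success_probability=0.95):
    """Smallest N such that P(at least one target individual) >= success_probability."""
    if not 0.0 < frequency < 1.0:
        raise ValueError("frequency must be strictly between 0 and 1")
    if not 0.0 < success_probability < 1.0:
        raise ValueError("success probability must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - success_probability) / math.log(1.0 - frequency))


for n_loci in (1, 3, 5):
    f2 = min_population_size(target_frequency(n_loci, 0.25))
    dh = min_population_size(target_frequency(n_loci, 0.5))
    print(f"{n_loci} loci: F2 population >= {f2}, DH population >= {dh}")
```

The required population size grows geometrically with the number of loci, which is consistent with the need for several generations of mating and selection, or for doubled haploids where they are available.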

When more than two parental lines are involved, gene 
pyramiding can be used to create individuals that are 
homozygous at all loci. Gene pyramiding involves multiple
initial crosses between several parents (FIG. 2). Because
the above strategies involve several generations of specific
matings and the production of numerous offspring, they
are more applicable to plants than animals. 



Introgression programmes. Introgression is a simple form 
of genotype building, in which a target gene is introduced 
into an otherwise productive, recipient line (BOX 1).
Molecular markers can be used in both the backcrossing 
and the intercrossing phases of such programmes. The 
effectiveness of the backcrossing phase can be increased in 
two ways (TABLE 2): by identifying carriers of the target
gene (foreground selection); and by enhancing recovery 
of the recipient genetic background (background selec- 
tion). Strategies for foreground and background selection 
have been the subject of several publications (for a recent 
review, see REF. 10). During the intercrossing phase, markers
can be used to select individuals that are homozygous
for the target gene. For multiple QTL, introgression can
be combined with gene pyramiding to decrease the number
of individuals required 11,12 .

In addition to requiring extra resources, an intro- 
gression programme diverts some selection pressure 
away from other traits of economic importance. To 
compensate for this, the benefit of the target gene must 
be greater than that which could be achieved by regular 
selection over the same period. Only genes with a large 
effect will meet this requirement 13 .

Recurrent selection programmes. For a single marker, the 
molecular score of an individual for use in recurrent 
selection is obtained as the estimate of the statistical 
association between marker genotype and phenotype 
(TABLE 2). For multiple markers, genotype effects can be
summed over all markers into a single molecular score 14 .
In addition to the molecular score, phenotypic informa- 
tion will be available on the selection candidate itself 
Box 2 | Selection programmes based on linkage disequilibrium



Markers that are tightly linked to a quantitative trait locus (QTL) can be in complete or
partial population-wide linkage disequilibrium (LD) with the QTL, such that some
marker-QTL haplotypes are more frequent than expected by chance (for example, MQ
and mq versus Mq and mQ) (part a of figure). In this case, selection can be directly on
marker genotype. The probability of population-wide LD is higher for closely linked
markers and in selected populations of small effective size, which is the case for
agricultural species. Population-wide LD can also be created by crossing (ideally inbred)
lines or breeds and will then exist between loosely linked markers for several generations
(part b of figure). When a marker and a QTL are in linkage equilibrium, all marker-QTL
haplotypes are present and at random-mating frequencies, and marker genotype gives no
information about QTL genotype (part c of figure). This will be the case for most linked
markers in an outbreeding population. However, the marker and QTL will be in partial
disequilibrium within a family. The extent of within-family disequilibrium depends on
the recombination rate (r), but will occur even with loose linkage (for example, r = 0.2).
This disequilibrium can be used to detect QTL and for selection on a within-family basis.
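
The statements in this box about the size and persistence of LD can be illustrated numerically. This sketch uses the standard disequilibrium coefficient D (observed haplotype frequency minus the product of allele frequencies) and assumes a large random-mating population, in which D is expected to shrink by a factor (1 - r) per generation. The function names and the example cross are assumptions made for illustration.

```python
def ld_coefficient(freq_haplotype_mq, freq_allele_m, freq_allele_q):
    """D for haplotype M-Q: observed frequency minus the product of allele frequencies."""
    return freq_haplotype_mq - freq_allele_m * freq_allele_q


def ld_after(d0, recombination_rate, generations):
    """Expected D after a number of generations of random mating."""
    if not 0.0 <= recombination_rate <= 0.5:
        raise ValueError("recombination rate must be between 0 and 0.5")
    if generations < 0:
        raise ValueError("generations cannot be negative")
    return d0 * (1.0 - recombination_rate) ** generations


# Cross of two inbred lines (MQ/MQ x mq/mq): the F1 carries only MQ and mq
# haplotypes, each at frequency 0.5, which gives the maximum D of 0.25.
d0 = ld_coefficient(0.5, 0.5, 0.5)
loose = ld_after(d0, 0.2, 5)    # loosely linked marker and QTL
tight = ld_after(d0, 0.01, 5)   # tightly linked marker and QTL
print(f"D at the cross: {d0:.3f}")
print(f"D after 5 generations, r = 0.20: {loose:.4f}")
print(f"D after 5 generations, r = 0.01: {tight:.4f}")
```

With loose linkage most of the disequilibrium created by the cross is gone within a few generations, whereas tightly linked markers retain it, which is why population-wide LD is useful mainly for closely linked markers.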



(Box 2 figure, panel label: within-family disequilibrium)




MOLECULAR SCORE
A score that quantifies the value
of an individual for selection
purposes derived on the basis of
molecular genetic data.

PROGENY TESTING
Evaluation of the breeding value
of an individual based on the
mean performance of its progeny.

DOUBLE-HAPLOID LINE
(DH line). A population of fully
homozygous individuals that is
obtained by artificially 'doubling'
the gametes produced by an F1
hybrid.

BACKCROSS
Crossing a crossbred population
back to one of its parents.



and/or its relatives. Given these alternative sources of 
information, three strategies for the selection of candi- 
dates for breeding can be distinguished: selection on mol- 
ecular score alone; selection on molecular score followed 
by selection on phenotype; and combined selection on an 
index of the molecular score and the phenotype. 
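
The first two strategies can be sketched in a few lines. The example assumes a purely additive model in which each marker contributes its estimated effect multiplied by the number of favourable alleles carried (0, 1 or 2); the marker names, effects, candidate records and function names are all hypothetical.

```python
def molecular_score(genotype, effects):
    """Sum of estimated marker effects, weighted by favourable-allele counts."""
    return sum(effect * genotype[marker] for marker, effect in effects.items())


def two_stage_selection(candidates, effects, keep_on_score, keep_on_phenotype):
    """Select on molecular score first, then on phenotype among the survivors."""
    by_score = sorted(
        candidates,
        key=lambda c: molecular_score(c["genotype"], effects),
        reverse=True,
    )
    shortlisted = by_score[:keep_on_score]
    by_phenotype = sorted(shortlisted, key=lambda c: c["phenotype"], reverse=True)
    return [c["id"] for c in by_phenotype[:keep_on_phenotype]]


# Hypothetical estimated effects per favourable allele at two markers.
effects = {"M1": 2.0, "M2": -0.5}
candidates = [
    {"id": "A", "genotype": {"M1": 2, "M2": 0}, "phenotype": 10.0},
    {"id": "B", "genotype": {"M1": 1, "M2": 1}, "phenotype": 14.0},
    {"id": "C", "genotype": {"M1": 2, "M2": 2}, "phenotype": 12.0},
    {"id": "D", "genotype": {"M1": 0, "M2": 0}, "phenotype": 15.0},
]

scores = {c["id"]: molecular_score(c["genotype"], effects) for c in candidates}
selected = two_stage_selection(candidates, effects, keep_on_score=2, keep_on_phenotype=1)
print(scores)
print(selected)
```

Selection on the score alone would stop after the first sort; the third strategy, combined selection on an index, replaces the two sorts with a single weighted criterion.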

Selection on molecular score alone will result in less 
genetic improvement than combined selection on mol- 
ecular score and phenotype, unless the molecular score 
captures all genetic variation or the phenotypic records
provide no information to differentiate selection candi- 
dates. A prime example of the latter is when one or 
more members from a family must be selected before it
is possible to collect phenotypic information that allows 
their breeding values to be differentiated (FIG. 3). This 
provides ideal opportunities for MAS because markers 
are used at a stage of the continuing selection pro- 
gramme that is underused, as quantitative selection at 
that stage is ineffective. So, apart from the extra resource 
requirements, this is a rather risk-free approach, with 
limited impact on response to quantitative genetic selection.
Opportunities to use MAS in this manner are crucially
influenced by reproductive rates (see below).



If informative phenotypic data are available along 
with molecular data, selection on a combination of 
molecular score and phenotypic information is the 
most powerful strategy. Methods to derive an index for 
combined selection were developed by Lande and 
Thompson 14 using selection index theory. The index 
optimally weights molecular score and phenotypic data 
such that the accuracy of the index as a predictor of the 
selection candidate's breeding value is maximized.
Combined selection is most effective when phenotypic 
information is limited because of low heritability or 
inability to record the phenotype on all selection candidates
before selection 15 . The paradox is that the ability to
detect QTL, which also requires phenotypic data, is also
limited for such cases 16 . So, unless different resources or
strategies are used for QTL detection, the greatest
opportunities for MAS might exist for traits with moderate
rather than low heritability.
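
A simplified version of the index makes the role of heritability explicit. The sketch below is derived from standard selection index equations under strong assumptions that are not stated in the article: one phenotypic record per candidate, phenotypic variance scaled to 1, and a molecular score that captures a fraction p of the additive genetic variance with effects known without error. The symbols h2 and p and the function names are introduced here for illustration.

```python
import math


def index_weights(h2, p):
    """Weights on phenotype and molecular score for the index b_z * z + b_m * m.

    Solves the 2 x 2 index equations with var(z) = 1, var(m) = p * h2,
    cov(z, m) = p * h2, cov(z, g) = h2 and cov(m, g) = p * h2.
    """
    if not 0.0 < h2 <= 1.0:
        raise ValueError("h2 must be in (0, 1]")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    denominator = 1.0 - p * h2
    if denominator == 0.0:
        raise ValueError("h2 = 1 and p = 1 leaves the weights undefined")
    weight_phenotype = h2 * (1.0 - p) / denominator
    weight_score = (1.0 - h2) / denominator
    return weight_phenotype, weight_score


def index_accuracy(h2, p):
    """Correlation between the index and the true breeding value."""
    weight_phenotype, weight_score = index_weights(h2, p)
    return math.sqrt(weight_phenotype + weight_score * p)


def relative_efficiency(h2, p):
    """Accuracy of combined selection relative to phenotypic selection alone."""
    return index_accuracy(h2, p) / math.sqrt(h2)


for h2 in (0.1, 0.25, 0.5):
    print(f"h2 = {h2:.2f}, p = 0.20: relative efficiency = {relative_efficiency(h2, 0.2):.3f}")
```

Under these assumptions the advantage of the index grows as heritability falls. The sketch takes p as given, so it does not capture the paradox noted above, namely that QTL detection is itself harder when heritability is low.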

Crossbred or hybrid performance. In theory, crosses 
between lines that are genetically distant are expected to 
show greater hybrid vigour or heterotic effects than those 
between more closely related lines, because differences in 
allele frequencies between genetically distant lines are 
expected to be greater. Genetic distance can be measured 
from differences in allele frequencies at anonymous 
markers spread throughout the genome. Evaluation of 
this concept for many crops 17 shows that marker-based
prediction of hybrid performance can be efficient if 
hybrids include crosses between lines that are related by 
pedigree or which trace back to common ancestral popu- 
lations. By contrast, prediction is not efficient for crosses 
between lines that are unrelated or that originated from 
different populations, because the associations (through 
LD) between marker loci and QTL that are involved in 
heterosis are not the same in the different populations 18 .
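
As an illustration of how genetic distance can be computed from marker allele frequencies, the sketch below uses the modified Rogers' distance for biallelic markers. This is one commonly used measure, chosen here as an assumption; the article does not say which distance the cited studies used, and the line names and frequencies are hypothetical.

```python
import math


def modified_rogers_distance(freqs_a, freqs_b):
    """Modified Rogers' distance between two lines for biallelic markers.

    Each list holds the frequency of one reference allele per marker.
    For biallelic loci the distance reduces to the root mean squared
    difference in allele frequency across markers, ranging from 0
    (identical frequencies) to 1 (fixed for alternative alleles everywhere).
    """
    if len(freqs_a) != len(freqs_b) or not freqs_a:
        raise ValueError("both lines need the same, non-empty set of markers")
    squared = sum((a - b) ** 2 for a, b in zip(freqs_a, freqs_b))
    return math.sqrt(squared / len(freqs_a))


# Hypothetical allele frequencies at three anonymous markers.
line_1 = [1.0, 0.0, 0.5]
line_2 = [0.0, 0.0, 0.5]
line_3 = [0.0, 1.0, 0.5]

print(f"line 1 vs line 2: {modified_rogers_distance(line_1, line_2):.3f}")
print(f"line 1 vs line 3: {modified_rogers_distance(line_1, line_3):.3f}")
```

As the text cautions, such a distance predicts hybrid performance only to the extent that marker-QTL associations are shared between the lines being compared.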

The limited ability to predict hybrid vigour in 
untested crosses has motivated the development of 
strategies that use the knowledge of QTL effects to gen- 
erate crosses that are predicted to create QTL genotypes 
with favourable non-additive effects. An example is the 
use of marker-based statistical methods to predict the
performance of untested crosses from the performance 
of parental lines in a limited number of test crosses 19 . 

State of the art 

In contrast to the past decades, when almost no markers 
were available and breeding was mostly based on selec- 
tion on phenotype, an ideal view of the future could be 
that the location and function of all genes that affect 
quantitative traits are known. Genotype building strate- 
gies could then be applied directly on those genes and 
tedious phenotype scoring would no longer be neces- 
sary. This, however, assumes that the effects of those 
genes are known with precision and are consistent; for 
example, in different environments and genetic back- 
grounds. Although this is far from the case, some geno- 
type building strategies are already routinely used (at 
least in plants) to manipulate genes of large effect or 
transgenic constructs; for example, in introgression programmes.
However, as theoretical and experimental
Table 2 | Strategies for the use of molecular data in genetic improvement programmes*

Strategy | Information required to compute molecular score | Composition of molecular score | Selection or decision criterion
Genotype building: pyramiding | Genotypes at target loci‡. (Linkage between target loci.) (Linkage phases between target loci.) | Presence-absence of target alleles. (Modified by linkage phase for linked target loci.) | Molecular score.
Introgression: foreground selection | Genotypes at target loci‡. | Presence-absence of target alleles. | Molecular score.
Introgression: background selection | Genotypes at marker loci across genome. (Linkage between markers.) (Linkage with target loci.) | Proportion of recipient alleles. (Proportion of recipient genome.) (Greater emphasis on markers linked to target loci.) | Molecular score. Index of molecular score and recipient trait phenotype.
Introgression: intercross selection | Genotypes at target loci‡. | Number of target alleles. | Molecular score.
Recurrent selection | Genotypes at QTL‡ or markers. Estimates of QTL or marker effects. (Linkage phases between QTL.) | Sum of effects for genotypes at QTL or markers. (Modified by linkage phase between tightly linked QTL.)§ | Molecular score. Molecular score followed by phenotype. Index of molecular score and phenotype.
Crossbreeding or hybrid production: choice of breeds or lines to cross | Allele frequencies at marker loci across genome. | Genetic distance between pairs of breeds or lines. | Molecular score.
Crossbreeding or hybrid production: choice of breeds or lines to cross | Genotypes at QTL or markers and QTL or marker effects. | Sum of effects for predicted genotypes at QTL or markers. | Molecular score.

*Items in brackets are optional. ‡This must be derived from linked markers if the functional gene has not been mapped. §REF. 64.



SELECTION INDEX THEORY
Theory of selection that
combines several traits or
sources of information, such
that the accuracy of the index as
a predictor of the selection goal
(for example, the breeding
value) is maximized.



results of QTL detection have accumulated, the initial
enthusiasm for the potential genetic gains allowed by 
molecular genetics has been tempered by evidence for 
limits to the precision of the estimates of QTL effects. 
The present mood is one of 'cautious optimism' 20 .

Today, a database literature search for 'marker-assisted
selection' provides hundreds of hits, but, in most
cases, MAS is mentioned only as a future perspective. 
Others have evaluated the potential of MAS using com- 
puter simulation. Overall, there are still few reports of 
successful MAS experiments or applications. Most refer 
to the use of molecular markers in genotype building 
programmes, at various levels of complexity. Successful 
reports include marker -assisted background selection 
with introgression of genes for which the functional 
variant is known, or which have clearly identifiable phe- 
notypic effects. Examples are the introgression of the Bt 
transgene into different maize genetic backgrounds 21 , of 
the Apor-nuH allele in mice 22 , and of the nak\l iurlc gene 
in chickens" (FIG. 4). Marker-assisted introgression of 
such 'known' genes is now widely used in plants, in par- 
ticular by private plant-breeding companies. However, 
even in this case, more work is needed to optimize the 
information provided by markers* and reduce costs 2 ws . 
Other reports on genotype building using known genes 
include the 'pyramiding' of several major disease resis- . 
tance genes in rice 2 *- 27 . Although a good knowledge of 
the spectrum of gene effects is necessary for the pyra- 
miding of multiple resistance genes, it is a proven valu- 
able step towards more durable and stable resistance, 
which could hardly be achieved without markers. 
Moreover, the use of markers provides a better under- 
standing of interactions between the introgressed genes. 



The experience of introgression of QTL using indirect 
markers in foreground selection is quite different. 
In general, introgression has resulted in improvement 
of the targeted traits but, with few exceptions (for 
example, see REF. 28), levels of improvement were below 
the expectations based on estimates of QTL effects 
from the detection phase. The reasons for this 
underperformance include inaccurate estimates of QTL 
location 29 , QTL that were lost or not controlled in the 
programme 30 , negative epistatic interactions between 
QTL 31 , or strong genotype-environment 
interactions 32,33 . Similar results were obtained for the 
introgression of three QTL for trypanotolerance in mice by 
gene pyramiding 34 , which represents the only report of 
marker-assisted foreground selection of QTL in animals; 
the markers proved useful to control the QTL 
genotype during the backcrossing phase, but the effects 
of the QTL in the new background were not always 
consistent with those observed during the QTL detection 
phase. 

The general conclusion to be drawn from these 
results is that for complex traits that are controlled by 
several QTL of moderate or low effect, or that are subject 
to high environmental variation, genotype-environment 
interactions, epistasis between QTL or epistasis 
between QTL and the genetic background, it is risky to 
carry out selection solely on the basis of marker effects, 
without confirming the estimated effects by phenotypic 
evaluation. This is true in particular if QTL were initially 
detected in a different population or genetic background. 

Although no documented reports are available, 
industrial applications of molecular data in livestock are 
RECOMBINANT INBRED LINE 
A population of fully 
homozygous individuals that is 
obtained by repeated selfing 
from an F1 hybrid, and that 
comprises ~50% of each 
parental genome in different 
combinations. 

NEAR-ISOGENIC LINE 
Lines that are genetically 
identical, except for one locus or 
chromosome segment. 
[Figure 2 panel labels: Selection of homozygotes for G1 + G2; selection of homozygotes for G3 + G4; selection of homozygotes for G1 + G2 + G3 + G4.] 

Figure 2 | Gene pyramiding. This example shows how four 
genes (G1-G4), which are present in four different lines 
(L1-L4), can be combined into a single line in a two-step 
procedure. In the first step, two lines are developed, which are 
each homozygous for two target genes (G1, G2 and G3, G4), 
by crossing pairs of lines. This is followed by construction of F2, 
RECOMBINANT INBRED LINE (RIL), or double-haploid (DH) 
progeny and selection of homozygotes. In the second step, 
such individuals are crossed to produce lines that are 
homozygous for all four target genes. Selection of 
homozygotes can be on the basis of linked markers. This 
process can be expanded to more than four genes by 
expanding the pyramid. 
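
The population sizes implied by a pyramiding step such as the one in Figure 2 can be sketched with a little Mendelian arithmetic. The following is a minimal illustration, not part of the original figure: it assumes unlinked target genes, an F1 that is heterozygous at every target locus, and no selection before genotyping. The function names are hypothetical.

```python
import math

# Probability that one progeny is homozygous for the target allele at a
# single locus, starting from a heterozygous F1 (assumption: no selection).
PER_LOCUS = {"F2": 0.25, "DH": 0.5, "RIL": 0.5}


def frac_target_homozygotes(n_genes: int, progeny_type: str) -> float:
    """Expected fraction of progeny homozygous for the target allele at
    all n_genes loci, assuming the loci are unlinked."""
    return PER_LOCUS[progeny_type] ** n_genes


def min_progeny(n_genes: int, progeny_type: str, prob: float = 0.95) -> int:
    """Smallest number of progeny that gives at least `prob` chance of
    recovering one or more individuals homozygous at all target loci."""
    f = frac_target_homozygotes(n_genes, progeny_type)
    if f >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - prob) / math.log(1.0 - f))


for kind in ("F2", "DH"):
    for n in (2, 4):
        print(kind, n, frac_target_homozygotes(n, kind), min_progeny(n, kind))
```

Under these assumptions, combining two genes at a time keeps the required progeny numbers far smaller than selecting for all four genes in a single segregating population, which is the rationale for building the pyramid in steps. Linkage between target genes would change the fractions.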



limited and have mainly been in the context of recurrent 
selection programmes, which are the principal vehicles 
for genetic improvement in animals. A mixture of causal 
and indirect markers is used. In swine, the indirect 
markers used were primarily identified by using 
candidate-gene approaches or positional cloning, whereas in 
dairy cattle, indirect markers identified using genome 
scans are also used. This species difference is partially 
explained by the different strategies that are used for 
QTL detection. In swine, genome scans are primarily 
based on crosses between divergent lines. These identify 
QTL that differ between breeds but have limited direct 
application for within-breed selection. Direct access to 
closed breeding populations has, however, made 
candidate-gene approaches relatively successful. In dairy 
cattle, QTL detection capitalizes on the large half-sib family 
sizes that result from extensive use of artificial 
insemination'. This allows genome scans to detect QTL that 
segregate within rather than between breeds. 



Most applications of MAS in livestock are geared 
towards cautious use that does not jeopardize the 
genetic gains that can be obtained by conventional 
selection, for example in pre-selection (FIG. 3). Other 
uses are for traits that are difficult to improve by 
conventional means because of low heritability (for 
example, the use of an oestrogen receptor gene marker to 
select for litter size in swine 55 ), or traits that are difficult 
to record (for example, traits that are related to disease 
resistance or meat quality). 

Challenges and future prospects 

Statistical aspects of MAS. Most applications of genetic 
markers in selection programmes are preceded by an 
analysis aimed at QTL detection, and only QTL that are 
shown to have a significant effect on phenotype are sub- 
sequently used for selection. This raises two important 
statistical issues: the setting of statistical thresholds for 
deciding which QTL to use; and dealing with the inher- 
ent overestimation of QTL effects. 

For QTL detection, very stringent methods are used 
to control the false-positive error rate, as suggested by 
Lander and Kruglyak 3 *. Several studies have, however, 
shown that greater gains from MAS can be obtained by 
allowing a higher rate of false positives, to increase the 
power to detect QTL effects and reduce the number of 
false-negative results 1 *- 77 . So, alternative strategies (for 
example, see REF. 38) are needed to more adequately 
balance the cost of false-positive against false-negative 
results for MAS. This balance might differ depending 
on the particular application. Thresholds could be 
lowered even further if proper statistical methods were 
used to account for the degree of uncertainty about 
estimates of QTL effects. For example, Meuwissen et 
al. 39 obtained a molecular score with high predictive 
ability on the basis of high-density marker genotyping 
data by using all estimated marker effects, regardless of 
their statistical significance. 
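
The trade-off between false-positive and false-negative results can be made concrete with a simple power calculation. This is a minimal sketch under assumptions that are not in the original text: the QTL effect is tested with a two-sided z-test, and `delta` (a hypothetical parameter name) is the true effect expressed in units of its standard error.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_two_sided(delta: float, alpha: float) -> float:
    """Power of a two-sided z-test at significance level alpha when the
    true effect equals delta standard errors."""
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    upper = 1.0 - _STD_NORMAL.cdf(z_crit - delta)   # significant, correct sign
    lower = _STD_NORMAL.cdf(-z_crit - delta)        # significant, wrong sign
    return upper + lower


# Relaxing the threshold raises power, that is, lowers the false-negative rate.
for alpha in (0.0001, 0.001, 0.01, 0.05, 0.20):
    print(f"alpha={alpha:<7} power={power_two_sided(2.5, alpha):.2f}")
```

The value of 2.5 standard errors is an arbitrary illustration. Which threshold is best for MAS depends on the relative cost of the two kinds of error, which this sketch does not model.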

Overestimation of QTL effects has been shown to 
occur both by theory 40,41 and by experimentation 43 
(see also the review by Barton and Keightley on p. 11 
of this issue). Overestimation of QTL effects leads to 
too much emphasis on molecular scores in selection 
relative to phenotypic data, and results in a less than 
optimal response to selection. In part, biases are 
caused by the use of only significant QTL effects, and 
they can be reduced, although not entirely removed 41 , 
by re-estimation of significant QTL effects in an 
independent sample. A less-biased estimate of QTL 
effects can be obtained using near-isogenic lines 43 , but 
the generation of such lines is a long and difficult 
process. Alternative statistical methods for the analysis 
of QTL data that avoid overestimation or reduce 
its impact on selection response are needed (for 
example, see R£F.«). 
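
The bias that arises from using only significant QTL effects can be illustrated by simulation. The sketch below is a deliberately simple model and not the method of any cited study: it assumes that effect estimates are normally distributed around the true effect with a known standard error, and all names and parameter values are hypothetical.

```python
import random
from statistics import NormalDist, mean


def selection_bias_demo(true_effect, std_error, alpha=0.001,
                        n_reps=20000, seed=1):
    """Return (mean estimate among significant results,
    mean re-estimate of those QTL in an independent sample,
    fraction of replicates declared significant)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    first, second = [], []
    for _ in range(n_reps):
        est = rng.gauss(true_effect, std_error)      # detection sample
        if abs(est) / std_error > z_crit:            # passes the threshold
            first.append(est)
            # Independent sample with the same true effect and precision.
            second.append(rng.gauss(true_effect, std_error))
    if not first:
        raise ValueError("no significant replicates; raise n_reps or alpha")
    return mean(first), mean(second), len(first) / n_reps


detected, re_estimated, power = selection_bias_demo(0.2, 0.1)
print(f"true effect 0.20, mean significant estimate {detected:.2f}, "
      f"mean re-estimate {re_estimated:.2f}, power {power:.2f}")
```

In this idealized model the re-estimate is unbiased. As noted above, real re-estimation does not remove the bias entirely, so the sketch should be read as showing the direction of the effect only.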

A more general point about the statistical aspects of 
MAS is that the existing models and theory do not 
adequately accommodate the more complex genetics 
that underlies quantitative traits. Furthermore, 
although existing quantitative genetic theory provides 
a satisfactory basis to derive selection strategies that 
maximize response to selection in the short term (one 
or two generations), the theory has been much less 
developed for selection over several generations. This 
was most clearly seen in several simulation studies that 
showed that combined selection on an index of molecular 
score and phenotype results in greater genetic gain 





[Figure 3 panel labels: Multiple ovulation and embryo transfer; pre-selection for marker M and progeny testing.] 

Figure 3 | Marker-assisted pre-selection for progeny testing. Because milk production is a 
sex-limited trait, dairy bulls go through a progeny test, in which they are evaluated on the basis of 
the milk production of 60-100 daughters. After the progeny test, the best bulls are selected for 
widespread use in the population through artificial insemination. Because of the high cost 
involved, only a limited number of bulls can be progeny tested each year. Selection of bulls to be 
tested is based on ancestral information, which means that all members of a full-sib family have 
the same estimated breeding value. Molecular scores will, however, differ between full-sibs if they 
inherited different marker alleles. Through reproductive technology, such as multiple ovulation and 
embryo transfer, several bull calves are produced per female and selection of bulls to progeny test 
can be on the basis of molecular score 3 ' *\ . The combination of marker-assisted pre-selection 
and progeny testing has a greater chance of producing highly productive animals. 



in the short term; but, in the long term, selection on 
phenotype alone resulted in a greater response to 
selection 4 *- 4 *, because selection is better distributed over all 
loci 47 . A theory to optimize selection on molecular 
score, in combination with phenotype, has been 
developed* -50 , but only for genetic models and selection 
strategies of limited complexity. Further theoretical work is 
needed to accommodate multilocus Mendelian inheritance 
and phenomena such as epistasis, genetic background 
effects and interactions between the environment 
and genetics. 
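
For the simplest case, the index of molecular score and phenotype can be written out explicitly using standard selection index algebra (weights b = P⁻¹g). The sketch below makes assumptions that go beyond the text: one own phenotypic record per individual, phenotypic variance scaled to 1, and a molecular score M whose marker effects are known without error and which explains a fraction `p` of the additive genetic variance. The function names are hypothetical.

```python
import math


def index_weights(h2: float, p: float):
    """Weights (b_phenotype, b_score) of the index
    I = b_phenotype * P + b_score * M.

    With var(P) = 1, var(A) = h2, var(M) = cov(P, M) = cov(A, M) = p * h2,
    solving b = P^-1 g gives the closed forms below."""
    if not (0.0 < h2 < 1.0 and 0.0 <= p <= 1.0):
        raise ValueError("need 0 < h2 < 1 and 0 <= p <= 1")
    denom = 1.0 - p * h2
    return h2 * (1.0 - p) / denom, (1.0 - h2) / denom


def index_accuracy(h2: float, p: float) -> float:
    """Correlation between the index and the true breeding value."""
    b_p, b_m = index_weights(h2, p)
    # For optimal weights var(I) = cov(I, A) = b'g, with g = (h2, p * h2).
    return math.sqrt((b_p * h2 + b_m * p * h2) / h2)


for p in (0.0, 0.2, 0.4):
    b_p, b_m = index_weights(0.25, p)
    print(p, round(b_m / b_p, 2), round(index_accuracy(0.25, p), 3))
```

With a heritability of 0.25, the ratio of the weight on the score to the weight on the phenotype is 3.75 when p = 0.2 and 5.0 when p = 0.4, which shows how an overestimated p shifts emphasis towards the molecular score. The sketch covers a single generation only and says nothing about the long-term response discussed above.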

Redesign of breeding programmes. Most applications 
of molecular genetics to breeding programmes have 
attempted to incorporate molecular data into the 
existing programmes. The effective use of molecular 
data might, however, require a complete redesign of 
breeding programmes. For example, in plants, the 
optimal design for MAS is to allocate test resources to 
a single, large population, such that the probability of 
detecting QTL is high, whereas for phenotypic selec- 
tion, the optimum is to have smaller populations in 
several locations to control for environmental varia- 
tion M . In addition, population structures and statisti- 
cal methods that allow the combination and use of 
QTL information across lines are needed. Other 
changes that are required for plant breeding pro- 
grammes are reviewed by Ribaut and Hoisington". 
Similarly, in animals, strategies are required that inte- 
grate the collection and analysis of phenotypic data 
for QTL detection with the use of this information for 
MAS (for example, REF. 37). 

Furthermore, breeding strategies must be devel- 
oped that take better advantage of the unique features 
of molecular data. For example, to capitalize on the 
ability to select on molecular score at an early age, sev- 
eral rapid rounds of selection exclusively on molecular 
score could be conducted. The speed of selection is 
then mainly limited by the reproductive cycle. Such 
programmes have been proposed for plants by 
Hospital et al. 47 , by incorporating one or two genera- 
tions of off-season selection on molecular score alone, 
and have been shown (by simulation) to increase 
genetic gain greatly. In animals, such strategies are 
effective only if combined with technologies that break 
the normal reproductive cycle. For example, in several 
livestock species, the technology exists to recover 
oocytes from the female before puberty, as early as 
from the unborn fetus. When combined with in vitro 
fertilization and embryo transfer, this reduces genera- 
tion intervals to several months, compared with at least 
3 years with regular reproduction in cattle". Haley and 
Visscher* suggested that the time required for one 
generation could be further reduced if meiosis could 
be conducted in vitro. Such technology, combined with 
nuclear transfer, would allow a breeding programme to 
be conducted in the laboratory, without creating ani- 
mals. Although some of this work is at an early stage, it 
is clear that the benefits of MAS will be much greater 
when molecular technology is integrated with repro- 
ductive technologies. 
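
The value of shortening the generation interval follows from the standard expression for genetic gain per year, i·r·σ_A / L (selection intensity times accuracy times additive genetic standard deviation, divided by the generation interval). The sketch below uses hypothetical numbers: only the 3-year interval comes from the text, while the 6-month interval, the intensities and the accuracies are assumptions chosen for illustration.

```python
def gain_per_year(intensity: float, accuracy: float,
                  sigma_a: float, interval_years: float) -> float:
    """Expected genetic gain per year: i * r * sigma_A / L."""
    if interval_years <= 0:
        raise ValueError("generation interval must be positive")
    return intensity * accuracy * sigma_a / interval_years


# Hypothetical inputs, chosen only to illustrate the trade-off.
conventional = gain_per_year(1.4, 0.8, 1.0, 3.0)   # accurate, 3-year interval
marker_only = gain_per_year(1.4, 0.4, 1.0, 0.5)    # less accurate, 6-month interval
print(f"conventional: {conventional:.2f} sigma_A per year")
print(f"marker only:  {marker_only:.2f} sigma_A per year")
```

Under these assumed values, halving the accuracy is more than compensated for by the shorter interval. The comparison ignores the decline in the accuracy of the molecular score over successive rounds of selection without phenotypes.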
Figure 4 | Introgression of the avian naked neck gene. The autosomal naked neck gene, 
which affects feather distribution in chickens and makes them more tolerant to heat, was 
introgressed from rural low-body-weight donor chickens (two small birds) into a commercial 
meat-type Cornish chicken recipient line (two large, white birds) 23 . Genome-wide markers were 
used to enhance recovery of the recipient line genome, which conveys rapid growth and high 
body weight. Picture is courtesy of A. Cahaner, The Hebrew University, Rehovot, Israel. 



The need to fine map quantitative trait loci. The ulti- 
mate aim of molecular genetic studies of quantitative 
genetic variation is to find the genes that influence the 
trait. However, the use of MAS does not require the gene 
to be known, but can be effective with linked markers. 
So, the crucial issue is how closely a QTL must be 
mapped for it to be useful for MAS. 

Several simulation studies have shown that for MAS 
based on within-family LD, informative markers that 
flank a QTL within 5 cM seem adequate' 6 . Given that 
markers are not fully informative in practice, this can be 
achieved by using haplotypes of several markers within a 
10-cM region around the QTL. For example, Spelman 
and Bovenhuis" found that a flanking marker interval 
of 5 cM around the QTL achieved ~85-90% of the extra 
response over selection without markers, relative to a 
flanking marker interval of 2 cM. 

Although further fine mapping of QTL might provide 
limited benefits for MAS based on within-family 
LD, the occurrence of population-wide LD will increase 
substantially if the markers are more tightly linked to 
the QTL. Selection on markers that are in population-wide 
LD with QTL is much preferred because QTL 
effects and linkage phase can be estimated from 
population-wide data instead of the limited data that would be 
available within a family*. For individual QTL, markers 
or marker haplotypes within 1 or 2 cM of the causative 
locus might be required for substantial population-wide 
LD to be present, depending on population size and 
selection history 57 . 
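
The dependence of population-wide LD on map distance and population size can be sketched with two textbook approximations that are not taken from the article itself: the expected squared correlation between loci at drift-recombination equilibrium, E[r²] ≈ 1 / (1 + 4·Ne·c), and the decay of the disequilibrium coefficient D by a factor (1 − c) per generation. Here `c` is the recombination fraction, taken as roughly cM/100 for short distances, and the function names are hypothetical.

```python
def expected_r2(ne: float, c: float) -> float:
    """Approximate expected r^2 between two loci in a population of
    effective size ne with recombination fraction c. Ignores mutation,
    selection and sampling error."""
    return 1.0 / (1.0 + 4.0 * ne * c)


def ld_after(d0: float, c: float, generations: int) -> float:
    """Disequilibrium coefficient after some generations of random
    mating, starting from d0."""
    return d0 * (1.0 - c) ** generations


for cm in (1, 2, 5, 10):
    print(f"{cm:>2} cM: expected r2 = {expected_r2(100, cm / 100):.3f}")
```

For an effective population size of 100, this approximation gives an expected r² of 0.20 at 1 cM but only about 0.05 at 5 cM, which is in line with the requirement for markers within 1 or 2 cM of the causative locus.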

LD can be exploited at a genome-wide level when 
marker data are available from a high-density marker 
map; for example, with a marker every centiMorgan. 
The potential of using such data was illustrated by 
Meuwissen et al. 39 , who simulated genome-wide data 



HAPLOTYPE 
trait locus with alleles Q and q, 
possible haplotypes are MQ, 
Mq, mQ and mq. 
for a breeding population based on the historical 
accumulation of mutations (which give rise to QTL) at 
locations throughout the genome in the context of a 
high-density marker map. They then computed molecular 
scores based on statistical associations of phenotype 
with marker haplotypes to capture population-wide 
LD. For populations that are representative of 
livestock with an EFFECTIVE POPULATION SIZE of 100, they 
showed that sufficient LD was available and that the 
molecular score had an accuracy of 85% as a predictor 
of the total genetic value of an individual, when marker 
spacing was 1 cM. Accuracy dropped to 81 and 74%, 
respectively, for marker spacings of 2 and 4 cM. 
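
A molecular score of this kind is, in its simplest form, a sum of estimated effects over all genotyped markers. The sketch below is a simplification and not the procedure of Meuwissen et al.: it uses additive single-marker effects rather than haplotype effects, and the effects, genotypes and candidate names are invented for illustration.

```python
def molecular_score(allele_counts, effects):
    """Sum of estimated effects over all markers.

    allele_counts: copies (0, 1 or 2) of the reference allele per marker.
    effects: estimated additive effect per allele copy at each marker.
    Every marker contributes, whether or not its effect is significant."""
    if len(allele_counts) != len(effects):
        raise ValueError("one effect is needed per marker")
    return sum(n * a for n, a in zip(allele_counts, effects))


# Hypothetical marker effects and candidate genotypes.
effects = [0.30, -0.05, 0.12, 0.02]
candidates = {
    "bull_A": [2, 1, 0, 0],
    "bull_B": [1, 0, 2, 2],
    "bull_C": [0, 2, 1, 0],
}
ranked = sorted(candidates,
                key=lambda name: molecular_score(candidates[name], effects),
                reverse=True)
print(ranked)
```

In practice the marker effects would have to be estimated jointly from phenotypic data with a method that handles many more markers than records, which is outside the scope of this sketch.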

Fine mapping of QTL will also increase the efficiency 
of foreground selection in introgression programmes 
because the genomic region that has to be controlled is 
smaller. This will reduce the number of individuals that 
are required and the genotyping cost. In addition, 
introgression of a smaller genomic region helps to eliminate 
unwanted genes that are located around the target QTL. 
This is particularly important when the donor is an 
exotic genetic resource. Similar considerations also hold 
true for recurrent MAS. 

So, the extensive resources that are required to fine 
map QTL, let alone clone the functional gene, will bene- 
fit genetic improvement programmes only to a degree. 
More detailed knowledge of the functional genes would, 
however, allow a better understanding of the physiology 
of the quantitative trait. This might allow better predic- 
tion of the effects of the QTL in different genetic back- 
grounds and environmental conditions, and on different 
characteristics of performance. In addition, specific 
management strategies could be developed for specific 
genotypes to enhance their performance. 

The economics of marker-assisted selection 

Economics is the key determinant for the application of 
molecular genetics in genetic improvement pro- 
grammes. The use of markers in selection incurs the 
costs that are inherent to molecular techniques. Apart 
from the cost of QTL detection, which can be substan- 
tial, costs for MAS include the costs of DNA collection, 
genotyping and analysis. The economic assessment of 
MAS is straightforward in some cases, but complex in 
others (TABLE 1), and has been addressed in few studies 
(for example, REFS 37,st, 56,59). These studies have relied 
primarily on genetic and economic modelling because 
the results are extremely difficult to verify using repli- 
cated experiments. 

Cases in which the economic merit of MAS is clear 
include situations in which molecular costs are more 
than offset by the savings in phenotypic evaluation. 
Examples are the use of markers in genotype building 
programmes, and selection on markers that are in 
population-wide LD for traits that are costly to evaluate 
(for example, disease resistance and meat-quality traits 
in animals). In other cases, the ability to select early 
offsets the extra costs that are associated with MAS. The 
benefits of being able to release new genetic material 
more quickly can be substantial, particularly in 
competitive markets. 
The economic merit of MAS becomes questionable 
and more difficult to evaluate in cases in which MAS is 
expected to provide greater genetic gain at increased 
costs. This is particularly the case for selection schemes 
that rely on a combination of phenotype and molecular 
score, because molecular costs are in addition to, not in 
place of, phenotypic costs. In such cases, MAS might 
not be economically more advantageous than quantita- 
tive genetic selection, although the economic merit of 
MAS could be restored by reducing the frequency of re- 
evaluation of marker effects 47 . Another consideration is 
that the resources allocated to MAS could also be allo- 
cated to enhance phenotypic selection programmes. 
For example, improvement by conventional selection 
could also be enhanced by increasing the number of 
individuals that are tested for phenotypic evaluation 51 . 
Further work on the economic evaluation and opti- 
mization of strategies for the use of molecular genetics 
in breeding programmes is required. It is likely that the 
economically optimal use of MAS necessitates a com- 
plete rethink of the design of breeding schemes, as 
described in the previous section. 

Conclusions 

Genetic improvement programmes for livestock and 
crop species can be enhanced by the use of molecular 
genetic information in introgression, genotype building 
and recurrent selection programmes. The prospects for 
MAS are greatest for traits that are difficult to improve 
through conventional means, because of low heritability 
or the difficulty and expense of recording phenotype. 
Recurrent selection using linked markers can be effective 
and does not require identification of the functional 
mutations, although some level of fine mapping is 
required, in particular to capitalize on population-wide 
LD. The identification and use of linked markers is based 
on empirical relationships with phenotype, and is, 
therefore, also limited to some degree by the heritability of the 
trait and the availability of phenotypic data. Phenotypic 
data requirements are lower with the use of population-wide 
LD than with the use of within-family LD. 

Unless genetic markers capture most of the genetic 
variation for the trait, which is far from the case at pre- 
sent, selection must be based on a combination of 
marker and conventional phenotypic data. Although 
several useful genes (primarily linked genetic markers) 
have been identified in livestock and crop species, their 
application has been limited and their success inconsis- 
tent, because the genes were not identified in breeding 
populations, or because they interact with other genes 
or the environment. The most effective use of markers 
has been in introgression programmes in plants. Further 
use of MAS might require a substantial redesign of 
breeding programmes, in combination with other tech- 
nologies, such as those associated with reproduction. 

Further advances in molecular technology and 
genome programmes will soon create a wealth of 
information that can be exploited for the genetic 
improvement of plants and animals. High-throughput 
genotyping, for example, will allow direct selection on marker 
information based on population-wide LD. Methods to 
effectively analyse and use this information in selection 
are still to be developed. The eventual application of 
these technologies in practical breeding programmes will 
be decided on economic grounds, which, along with 
cost-effective technology, will require further evidence of 
predictable and sustainable genetic advances using MAS. 
Until complex traits can be fully dissected, the application 
of MAS will be limited to genes of moderate-to-large 
effect and to applications that do not endanger the 
response to conventional selection. Until then, observable 
phenotype will remain an important component of 
genetic improvement programmes, because it takes 
account of the collective effect of all genes. 